Poster: stbalbach Date: Nov 30, 2009 11:19am

Forum: texts Subject: PDF's on Amazon Kindle

I recently bought an Amazon Kindle DX for reading Internet Archive PDFs. However, the Kindle doesn't support most of IA's scans (incompatible image type). Of course it's possible to convert PDF -> Mobi using the program Calibre, but a lot is lost: pictures, page layout and numbers, marginalia, the font and original look of the book. In other words, everything that makes reading scanned books so much better than plain text.

I found a workaround to display PDF on the Kindle. Basically it involves converting the DjVu version to PDF. This page explains: http://www.aeonity.com/david/how-convert-djvu-files-pdf-free-in-bulk

The solution requires the commercial Pro version of Adobe. But in the comments section some users report it working with a freeware replacement, "BullZip" (http://www.bullzip.com/download.php), or with a program called IrfanView (http://www.irfanview.com/), which has plugins making it a single-program solution. I have not tried these.

Anyway, I can now view the entire IA library on my Amazon Kindle, in PDF format, and am very happy.

Stephen
This post was modified by stbalbach on 2009-11-30 19:19:13

Poster: aibek Date: Dec 26, 2012 11:08pm

Forum: texts Subject: Re: PDF's on Amazon Kindle

Hello

Does this problem still exist with Kindle? If yes, are you sure it is due to incompatible image types? If not, please see below.

I have found that some pdf files produced by IA do not follow the PDF specification, and that causes them to fail with some programs (e.g., in pdflatex with the error ‘invalid font in reference type ’). I will write a post about this on the forum, and the solution, in due time, but for now I am looking for other instances of the failure.

Could the problem in Kindle be due to the same bug? Can you please check? The pdf file from the page mentioned below (Lyon’s Grammar) fails with pdflatex with the error mentioned above. If you are not sure about the source of the Kindle problem, can you please check (i) that it fails with Kindle too, and (ii) that the solution I propose works.

(ii): To correct -- which is a hack (for now) -- replace:

00006 0 obj
<< /F2 << /Type /Font /Subtype /Type1 /Encoding << /Differences 5 0 R >> /BaseFont /Times-Roman >>
/F2B << /Type /Font /Subtype /Type1 /Encoding << /Differences 5 0 R >> /BaseFont /Times-Bold >>
/F2I << /Type /Font /Subtype /Type1 /Encoding << /Differences 5 0 R >> /BaseFont /Times-Italic >>
/F3 << /Type /Font /Subtype /Type1 /Encoding << /Differences 5 0 R >> /BaseFont /Courier >>
>>
endobj

with:

00006 0 obj
<<>>
endobj

in the pdf file with a hex editor or a good text editor. (Essentially, just delete everything between the two outermost pairs of angle brackets.) This process corrupts the xref table, because the byte offsets of the later objects shift, but pdf readers should be able to digest the file anyway (perhaps with a warning saying that “xref table is corrupted”).
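For anyone without a hex editor, the same edit can be sketched in a few lines of Python. This is only an illustration, not the procedure used for the attached files: the function name blank_font_dict and the sample bytes are made up for the example, and the object header "00006 0 obj" is simply copied from the listing above. As a variation on the manual edit, the sketch pads the emptied dictionary with spaces so the object keeps its original byte length, which should leave the offsets of the later objects (and hence the xref table) unchanged.

```python
def blank_font_dict(pdf_bytes: bytes, obj_header: bytes = b"00006 0 obj") -> bytes:
    """Replace the body of one PDF object with an empty dictionary '<<>>'.

    The replacement is padded with spaces to the original length, so every
    byte after the object stays at the same offset. (Assumption: the object
    is a plain dictionary with no stream, as in the listing in the post.)
    """
    start = pdf_bytes.find(obj_header)
    if start < 0:
        raise ValueError("object header not found")
    body_start = start + len(obj_header)
    end = pdf_bytes.find(b"endobj", body_start)
    if end < 0:
        raise ValueError("matching endobj not found")

    old_body = pdf_bytes[body_start:end]
    pad = len(old_body) - len(b"\n<<>>\n")
    if pad < 0:
        raise ValueError("object body is too short to hold '<<>>'")
    # Newline, empty dictionary, space padding, newline: same length as before.
    new_body = b"\n<<>>" + b" " * pad + b"\n"
    return pdf_bytes[:body_start] + new_body + pdf_bytes[end:]


# Tiny made-up fragment standing in for a real file.
sample = (
    b"%PDF-1.4\n"
    b"00006 0 obj\n"
    b"<< /F2 << /Type /Font /Subtype /Type1 /BaseFont /Times-Roman >> >>\n"
    b"endobj\n"
    b"7 0 obj\n<<>>\nendobj\n"
)
patched = blank_font_dict(sample)
print(len(sample) == len(patched))  # the file length is unchanged
```

On a real file one would read it with open(path, "rb"), patch the bytes, and write the result to a new file, keeping the original as a backup.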

I have attached two versions of a page of the book. (It is enough to check the error with one page.) p.30-orig.pdf is the original page. p.30-edited.pdf is the page with the above mentioned required edit. There are no viruses in the files. You could check by just using these two files (70 KB each).

Thanks for your help.

http://archive.org/details/analysissevenpa00unkngoog

Attachment: p.30-orig.pdf
Attachment: p.30-edited.pdf

Poster: stbalbach Date: Dec 28, 2012 11:40pm

Forum: texts Subject: Re: PDF's on Amazon Kindle

> Could the problem in Kindle be due to the same bug?

Recall it's because the Kindle DX doesn't support layered PDF documents.

I downloaded and opened the p.30's and they both display correctly, perhaps because the original is not layered.

Try this one: http://archive.org/details/devisesetembleme00lafeu

On downloading the PDF and opening it on the DX, it says "some elements on this page can not be displayed" and every page is blank (i.e. 100% of the elements can't be displayed). If you want, make the changes mentioned above and I'll try it again (I don't have the tools to edit a binary file).

Stephen

Poster: aibek Date: Dec 31, 2012 6:32am

Forum: texts Subject: Re: PDF's on Amazon Kindle

First about the file.

Every page image that we see in this file is made up of 3 images: a background, a foreground and a mask. So the pdf really contains not 116, but 348 images. The foreground and background are in RGB, the mask is BW (1 bit per sample).

To see why this is useful, imagine brown text on a yellow page. If such a 1000x1000 sample image is saved in RGB (3 bytes per sample), it would take 3,000,000 bytes before compression. In the IA method the image is decomposed into 3 parts: a brown 1000x1000 image, a yellow 1000x1000 image, and a BW 1000x1000 mask, and each of them is compressed and saved separately in the PDF. It is the reader’s work to join them together. '0' in the mask image means the reader is to compose the final image by using the corresponding pixel of the foreground image; '1' means that the reader is to use the corresponding pixel of the background image.
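The joining step can be sketched in Python as follows. This is a toy illustration of the rule just described (0 in the mask selects the foreground pixel, 1 selects the background pixel), not how any actual PDF reader is implemented; the function name compose, the colour values and the 3x2 image size are all made up for the example.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]


def compose(foreground: List[List[RGB]],
            background: List[List[RGB]],
            mask: List[List[int]]) -> List[List[RGB]]:
    """Join the three layers pixel by pixel.

    Mask value 0 -> take the foreground pixel,
    mask value 1 -> take the background pixel.
    All three layers are assumed to have the same dimensions.
    """
    return [
        [fg if m == 0 else bg for fg, bg, m in zip(fg_row, bg_row, m_row)]
        for fg_row, bg_row, m_row in zip(foreground, background, mask)
    ]


BROWN: RGB = (101, 67, 33)      # hypothetical text colour
YELLOW: RGB = (255, 255, 153)   # hypothetical paper colour

fg = [[BROWN] * 3 for _ in range(2)]    # uniformly brown layer
bg = [[YELLOW] * 3 for _ in range(2)]   # uniformly yellow layer
mask = [[0, 1, 1],
        [1, 0, 1]]                      # 0 marks where the "text" is

page = compose(fg, bg, mask)
for row in page:
    print(["brown" if px == BROWN else "yellow" for px in row])
```

The two colour layers here are completely uniform, so all the shape information sits in the 1-bit mask.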

This helps as the three can be compressed much better. At an extreme, you can imagine the brown and the yellow images taking just a few bytes each: 3 bytes for recording the colour, and a few bytes more for recording the dimensions of the image. Also, our most critical data is in the mask -- that image has to be the sharpest. But that now has 1-bit samples, and not the 24 bits (3 bytes) we had earlier.

There are a few more details. First, the background image is saved at a lower resolution, and the reader is asked to interpolate. So a small, if insignificant, loss in quality creeps into the IA "compression". Second, the background and the foreground images are not limited to one colour -- they can be full-fledged RGB images too. The point is that the background image will be filled with the surrounding colour in the places where the (foreground) objects are -- those parts will not be used anyway -- and thus will have a more or less uniform colour or gradient. Similarly, the foreground image will be filled with nearby colours in the places where it will not be read. This way the compression is much better. The critical stuff is (i) identifying the foreground and background properly, and (ii) filling colours in areas which will not be read so that the image can be compressed best by the compression method of choice ('JPXDecode' for the RGB images in IA files). IA’s PDF files are produced by 'LuraDocument PDF v2.28'.

The compression is pretty significant. In the attached page 2 of the Devises et Emblemes file, the three images together take 42 KB. The dimensions of the composed image, however, are 2201x3063, so RGB (3 bytes per sample) would take 19 MB! This is the size of the file you will get when you join the three together in the intended manner.
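The arithmetic behind these figures can be checked with a short Python sketch. The helper names are made up for the example, and "MB" is taken here to mean 2**20 bytes, which is an assumption about how the 19 MB figure was rounded; the mask size additionally assumes each row is padded to a whole byte.

```python
def raw_rgb_bytes(width: int, height: int, bytes_per_pixel: int = 3) -> int:
    """Uncompressed size of an RGB image at 3 bytes per pixel."""
    return width * height * bytes_per_pixel


def raw_mask_bytes(width: int, height: int) -> int:
    """Uncompressed size of a 1-bit mask, each row padded to a whole byte."""
    return ((width + 7) // 8) * height


sample_rgb = raw_rgb_bytes(1000, 1000)    # 3,000,000 bytes
sample_mask = raw_mask_bytes(1000, 1000)  # 125,000 bytes
page_rgb = raw_rgb_bytes(2201, 3063)      # 20,224,989 bytes

print(f"1000x1000 RGB image: {sample_rgb:,} bytes")
print(f"1000x1000 1-bit mask: {sample_mask:,} bytes")
print(f"2201x3063 RGB page: {page_rgb:,} bytes = {page_rgb / 2**20:.1f} MB")
print(f"ratio to the 42 KB stored in the PDF: about {page_rgb / (42 * 1024):.0f}x")
```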

So, most likely, Kindle is refusing to do this work. Please check for the text-layer issue too. The attached orig-p.2.pdf is the p. 2 of the book. The minus-textlayer.pdf is that page minus the text layer. (Both the files are of 40 KB size.) I am assuming that your Kindle can read neither of the files.

Poster: aibek Date: Dec 31, 2012 6:38am

Forum: texts Subject: Re: PDF's on Amazon Kindle

I forgot to add the files in the previous post.

Attachment: orig-p.2.pdf
Attachment: minus-textlayer.pdf

Poster: stbalbach Date: Jan 1, 2013 2:19pm

Forum: texts Subject: Re: PDF's on Amazon Kindle

I tried both and got the same error about being unable to display elements, and each shows a blank page. You are probably correct that the problem is these readers don't have the CPU and/or memory to handle the compression.

Poster: aibek Date: Jan 5, 2013 7:52pm

Forum: texts Subject: Re: PDF's on Amazon Kindle

Btw, the pdf reader is supposed to paint two images, one on top of the other: the background image in the background, and the foreground+mask image (i.e. a cut-out of the foreground image) over it. So it is really due to the presence of layers of images as you initially suspected.

Poster: aibek Date: Dec 29, 2012 9:44pm

Forum: texts Subject: Re: PDF's on Amazon Kindle

Since you can open the p.30-orig file in Kindle, Kindle does not share the pdftex/pdflatex problem which I was talking about.

Let me investigate the properties of the Devises et Emblemes file.

Poster: aibek Date: Dec 26, 2012 11:57pm

Forum: texts Subject: Re: PDF's on Amazon Kindle

For completeness: the p.30-edited file above has had its xref errors, plus another inconsequential error, fixed using pdftk. Another file with just the change I mentioned above (i.e., some stuff deleted) is attached to this post, with the name p.30-edited-xref-notfixed.pdf.

Attachment: p.30-edited-xref-notfixed.pdf
